Evaluating the use of non‐invasive hair sampling and ddRAD to characterize populations of endangered species: Application to a peripheral population of the European mink

Abstract The application of next‐generation sequencing (NGS) to non‐invasive samples is one of the most promising methods in conservation genomics, but these types of samples present significant challenges for NGS. The European mink (Mustela lutreola) is critically endangered throughout its range. However, important aspects such as census size and inbreeding remain unknown in many populations, so it is crucial to develop new methods to monitor this species. In this work, we placed hair tubes along riverbanks in a border area of the Iberian population, which allowed the genetic identification of 76 European mink hair samples. We then applied a reduced‐representation genomic sequencing technique (ddRAD) to a subset of these samples to test whether we could extract sufficient genomic information from them. We show that several problems with the DNA, including contamination, fragmentation, oxidation, and possibly sample mixing, affected the samples. Using various bioinformatic techniques to reduce these problems, we were able to unambiguously genotype 19 hair samples belonging to six individuals. This small number of individuals showed that the demographic status of the species in this peripheral population is worse than expected. The data obtained also allowed us to perform preliminary analyses of relatedness and inbreeding. Although further improvements in sampling and analysis are needed, the application of the ddRAD technique to non‐invasively obtained hairs represents a significant advance in the genomic study of endangered species.


[Figure panel labels: Additional cycles; Cycle 1; Sequencing primer Read 1; Melting and priming. In the first cycle, PCR priming occurs only from the P1 end because of the forked P2 adapter.]

Table S1. Hair samples analyzed in this study with information about locality data, position of the two traps at each locality (named "U": upper and "L": lower), adhesive sheet (called "A" and "B" sheets when the two sheets in one trap were analyzed; otherwise there is only an "A" sheet), revision, and species determination.
Table S2. Tissue samples analyzed in this study with information about locality data, collection date (in format dd/mm/yyyy), and sex of European mink obtained from live trapping. Names of captured specimens detected with hair traps are given in parentheses after the specimen code. The specimen code consists of the sample code of one of the samples from that specimen, with "IBE-" added as a prefix.

Table S5. DNA concentration by qPCR, sex determination of 76 European mink hair samples, and criteria for selecting samples for ddRAD. The concentration of autosomal and Y chromosome DNA is given in ng/µl. Samples included in the ddRAD libraries are marked with X. Samples that did not amplify during library preparation are denoted as (X). The specimen code represents genotyping results of correctly genotyped samples. Names of the previously captured specimens are given in parentheses after the specimen code.

Figure S1. Barplots showing the proportion of nucleotide substitutions found in the comparison of hair and tissue samples from the same individual (a) before and (b) after removing singletons. (c) Proportion of changes associated with the singletons in the group of hair samples.
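The kind of tally summarized in these barplots can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the hair and tissue genotypes of one individual are already available as aligned strings of base calls, and the function name `substitution_proportions` is hypothetical.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def substitution_proportions(tissue_calls, hair_calls):
    """Proportion of each substitution type (e.g. "C>A") among mismatches.

    tissue_calls and hair_calls are assumed to be equal-length sequences of
    single-base calls at the same sites; the tissue call is treated as the
    reference. Sites with a missing or ambiguous call are skipped.
    """
    counts = Counter()
    for ref, alt in zip(tissue_calls, hair_calls):
        if ref in VALID_BASES and alt in VALID_BASES and ref != alt:
            counts[f"{ref}>{alt}"] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {change: n / total for change, n in counts.items()}


# Toy data (invented for illustration only): three mismatching sites.
example = substitution_proportions("CCCGAN", "AACGTN")
```

An excess of one substitution class in the hair samples relative to the others is the type of signal that such a tally makes visible.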

Figure S2. Diagram showing the preparation of a ddRAD library from DNA with oxidative damage and a possible mechanism by which oxidative damage, which generates 8-oxoguanine (G*) from guanine, affects the sequencing of the ddRAD library. (a) In this example, there are two damaged bases, one on each DNA strand. (b) Adapters P1 (with the Read 1 primer sequence attached to the 5' end of the top strand) and P2 (with Read 2 in the corresponding position) are ligated in an oriented way. (c) During the PCR cycles, only the top DNA strand is amplified as a result of the forked P2 adapter (Peterson et al. 2012). (d) During sequencing, this results in only the G* present on the bottom strand being reflected in a C-A change in Read 1 sequences, while the G* present on the top strand is not detected by this read. The effect of G* in ddRAD, producing only C-A changes, is opposite to that observed in typical Illumina libraries with oxidized bases, where Read 1 reflects only G-T changes due to the different orientation of the sequencing primers in the library (Chen et al. 2017).
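The strand logic in this caption can be followed with a toy model. The sketch below is one reading of the proposed mechanism under simplifying assumptions: 8-oxoguanine is written as a lowercase `g`, the polymerase is assumed to always insert A opposite it, and the Read 1 sequence is assumed to derive from a copy of the bottom strand made from the P1 end. All names and the example sequences are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
OXO_G = "g"  # lowercase g marks 8-oxoguanine (G*) in this toy model


def copy_strand(template):
    """Synthesize the strand complementary to `template` (both written 5'->3').

    Assumption: the polymerase always pairs 8-oxoguanine with A.
    """
    new_strand = []
    for base in reversed(template):
        new_strand.append("A" if base == OXO_G else COMPLEMENT[base])
    return "".join(new_strand)


def read1_changes(top, bottom):
    """List (position, change) differences seen in Read 1.

    Assumption: Read 1 comes from a copy of the bottom strand, so damage on
    the original top strand never reaches the read.
    """
    undamaged_top = top.upper()
    read1 = copy_strand(bottom)
    return [
        (i, f"{ref}>{obs}")
        for i, (ref, obs) in enumerate(zip(undamaged_top, read1))
        if ref != obs
    ]


# One G* on each strand, as in panel (a); sequences invented for illustration.
top_strand = "AACgTC"      # G* at position 3 of the top strand
bottom_strand = "gACGTT"   # G* opposite the final C of the top strand
changes = read1_changes(top_strand, bottom_strand)
```

In this model only the bottom-strand lesion appears in Read 1, as a C>A change at the paired position, while the top-strand lesion leaves no trace.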

Table S3. Mitochondrial primers used in this study for species determination by PCR (the primer marked with '*' was used without modification from García et al. 2017 and the other was slightly modified) and primers used for amplification of an autosomal and a Y chromosome fragment by qPCR of the European mink. The fragment length is given without counting the primers.

Table S4. D-loop haplotypes of the European mink samples. They are identical except that haplotype 2 has a gap in the middle of the sequence.

Table S6. Basic statistics of the bioinformatic analyses of the ddRAD libraries of the group of 34 hair samples before and after filtering with reads from tissue samples to discard exogenous sequences.

Table S7. Sex corroboration results through detection of Y-chromosome sequences within genomic reads. Low-coverage samples (those with fewer than 100,000 retained reads) could not be reliably sex-assigned with this approach and are not represented.